Fixes SWDEV-483388 #74
Conversation
@@ -57,7 +57,7 @@ common::Status GPUDataTransfer::CopyTensorAsync(const Tensor& src, Tensor& dst,
       HIP_CALL_THROW(hipMemcpyAsync(dst_data, src_data, bytes, hipMemcpyDeviceToDevice, static_cast<hipStream_t>(stream.GetHandle())));
     } else {
       // copy from other CPU memory to GPU, this is blocking
-      HIP_CALL_THROW(hipMemcpy(dst_data, src_data, bytes, hipMemcpyHostToDevice));
+      HIP_CALL_THROW(hipMemcpyWithStream(dst_data, src_data, bytes, hipMemcpyHostToDevice, static_cast<hipStream_t>(stream.GetHandle())));
Fix lint. Also, I thought we wanted to block in this case, but I suppose this allows only the MIGraphX EP to block on the stream now, since this gpu_data_transfer isn't shared between both EPs.
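For context on the blocking question: hipMemcpyWithStream is still synchronous from the host's point of view, but the copy is enqueued on the given stream rather than on the legacy default stream, so it is ordered after work already submitted to that stream. A minimal sketch of the pattern, with a generic error-check macro standing in for onnxruntime's HIP_CALL_THROW and illustrative names (copy_host_to_device, d_dst, h_src):

```cpp
#include <cstdio>
#include <cstdlib>
#include <hip/hip_runtime.h>

// Generic error check, standing in for onnxruntime's HIP_CALL_THROW.
#define HIP_CHECK(expr)                                                    \
  do {                                                                     \
    hipError_t err_ = (expr);                                              \
    if (err_ != hipSuccess) {                                              \
      std::fprintf(stderr, "HIP error: %s\n", hipGetErrorString(err_));    \
      std::abort();                                                        \
    }                                                                      \
  } while (0)

// Host-to-device copy ordered on `stream`. hipMemcpyWithStream blocks the
// host until the copy finishes (like hipMemcpy), but the copy itself runs
// on `stream`, after any work already queued there; plain hipMemcpy would
// instead be ordered on the legacy default stream.
void copy_host_to_device(void* d_dst, const void* h_src, size_t bytes,
                         hipStream_t stream) {
  HIP_CHECK(hipMemcpyWithStream(d_dst, h_src, bytes,
                                hipMemcpyHostToDevice, stream));
}
```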
@@ -1432,7 +1432,11 @@ Status MIGraphXExecutionProvider::Compile(const std::vector<FusedNodeAndGraph>&
       std::vector<int64_t> ort_shape{res_lens.begin(), res_lens.end()};
       auto output_tensor = ctx.GetOutput(i, ort_shape.data(), ort_shape.size());
       void* output_data = output_tensor.GetTensorMutableRawData();
-      HIP_CALL_THROW(hipMemcpy(output_data, gpu_res.data(), res_shape.bytes(), hipMemcpyDeviceToDevice));
+      HIP_CALL_THROW(hipMemcpyWithStream(output_data,
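The added call is truncated in this excerpt. Judging from the replaced line and the pattern in the first hunk, the full call presumably passes the same arguments plus the EP's compute stream; this is an assumed reconstruction, and the `rocm_stream` name is a guess, not taken from the PR:

```cpp
// Assumed completion of the truncated hunk; `rocm_stream` is a placeholder
// for the EP's compute-stream handle and may not match the actual variable.
HIP_CALL_THROW(hipMemcpyWithStream(output_data,
                                   gpu_res.data(),
                                   res_shape.bytes(),
                                   hipMemcpyDeviceToDevice,
                                   static_cast<hipStream_t>(rocm_stream)));
```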
With MIGraphX we perform the run_async as part of eval(), which streams execution. Were you seeing stale data somehow with this memcpy for networks with many outputs? I tried this with BERT and didn't see a change in perf, but I'm curious why this changed.
Merged d9f0ef8 into rocm6.3_internal_testing

Merging this as it was already put in upstream.
GPU memory copy should use the provided stream.
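To illustrate the hazard behind this one-line summary: if the kernel producing a buffer runs on a non-blocking stream while the copy uses plain hipMemcpy (which is ordered on the legacy default stream), the copy is not guaranteed to wait for the kernel. A hedged sketch; the `produce` kernel and all names are illustrative, and error checks are omitted for brevity:

```cpp
#include <hip/hip_runtime.h>

// Illustrative kernel; not from the PR.
__global__ void produce(float* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = static_cast<float>(i);
}

void copy_result(float* d_src, float* d_dst, int n, hipStream_t stream) {
  hipLaunchKernelGGL(produce, dim3((n + 255) / 256), dim3(256), 0, stream,
                     d_src, n);

  // Risky: hipMemcpy is ordered on the legacy default stream. If `stream`
  // was created with hipStreamNonBlocking, the copy may start before
  // `produce` finishes and read stale data.
  // hipMemcpy(d_dst, d_src, n * sizeof(float), hipMemcpyDeviceToDevice);

  // Pattern from this PR: enqueue the copy on the same stream, so it is
  // ordered after `produce`.
  hipMemcpyWithStream(d_dst, d_src, n * sizeof(float),
                      hipMemcpyDeviceToDevice, stream);
}
```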